Method for protecting digital content from unauthorized use by automatically and dynamically integrating a content-protection agent

ABSTRACT

A content processor application is loaded into memory from a master image to form a runtime content processor application image. An integration agent dynamically integrates a protection agent into the loaded runtime content processor application image to form a customized content processor application with extended functionality. Only the runtime content processor application image is extended with the protection agent—the application master image remains unaltered.

RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 10/194,655, filed Jul. 11, 2002, which claims the benefit of U.S. Provisional Application No. 60/305,589, filed on Jul. 13, 2001.

The entire teachings of the above applications are incorporated herein by reference.

BACKGROUND OF THE INVENTION

As more and more digital content is transacted electronically, there is an increasing demand for technologies that can secure the content from unauthorized use and distribution. Unlike physical goods, digital content is easily copied and distributed. The only way to prevent this is for the content provider to establish a trusted environment on the end user's machine that can act as a proxy for securing the content from illegal copying and distribution after it is shipped to an authorized end user.

Cryptographic solutions such as Pretty Good Privacy (PGP) (available from Network Associates) and RSA (available from RSA Security, Inc.) secure digital content during its transmission through an untrusted channel, but are inadequate for securing it once it gets to the end user's machine. In fact, the science of cryptography matured during World War II as a means for protecting an untrusted communication channel between two parties that trust each other.

In the present case, however, the content provider would generally prefer not to have to trust the end user receiving the content, and so the security of the content must persist even after the digital content has been received by the end user. Furthermore, end users who receive digital content would generally prefer not to be burdened with the security concerns of the content provider simply because they received the digital content. This “last mile” problem cannot be addressed by cryptographic techniques alone, because they require the encrypted document to be converted to clear-text on disk before it can be viewed or manipulated by an application on the end user's machine.

One way to establish a trusted end point on the end user's machine is to force the end user to use a trusted piece of software, namely the content player application, to “play” or process the content. The trusted content player application should be capable of directly processing the digital content in the encrypted format in which it is shipped, so that a decrypted or “clear text” form of the original content is never created on disk.

Another solution is to create a security plugin module that can extend the content player application with the desired security features.

SUMMARY OF THE INVENTION

What makes the “last mile” problem difficult to solve in the real world is the fact that most content publishers do not control the source code for the content player applications that process their content. For example, music publishers create audio content in formats such as MP3, Real Audio and Windows Media, but the applications that play audio files in these formats are manufactured by software publishers, and not the music publishers.

Companies that offer digital content security or digital rights management solutions have to seek the cooperation of the software publishers before they can sell their solutions to the content publishers. They have to partner with the software publisher to do source-level integration of their security solution into the software application that will play the content, in order to create a trusted endpoint on the end user's machine for the content publisher. This makes market penetration for such approaches very difficult.

Even if the cooperation of software publishers can be successfully obtained, it is still up to the end user to upgrade any existing version of the software application to the custom version that has the content protector embedded in it. This creates an additional barrier to deployment, especially if the end user has to pay for the upgrade.

An example of such a situation recently appeared in connection with the encryption of PDF files using a digital rights management (DRM) solution. Adobe Systems Incorporated is a software company that manufactures a line of software applications, called Acrobat, based on the PDF file format. It distributes both a limited-functionality Acrobat Reader, which is free, and a full-featured Acrobat product, which costs several hundred dollars to purchase.

Adobe recently announced a partnership with the developer of the DRM solution to integrate that solution into Adobe's line of Acrobat applications. However, Adobe chose to integrate the DRM solution only with its full-featured product and not with the free reader.

A content publisher interested in secure distribution of PDF files would prefer to see the DRM solution integrated into both the free reader and the full-featured Acrobat application. This is because the high cost of the full-featured product creates a significant market barrier for the content publisher: an end user receiving a secure PDF file would have to have the expensive full-featured Acrobat product and not just the free reader.

The impact of this price differential can be clearly seen by comparing the installed base of the full-featured version against that of the free reader; the installed base of the full-featured version is tiny compared to that of the free reader. This highlights the fact that for a digital content security solution to be easily deployable in the market, it should be able to work with existing and legacy software applications that can process the original content format.

Some software manufacturers, such as Microsoft Corporation, have taken the initiative of integrating their own security solutions into the content players that they manufacture. They can then provide security services to the content publishers, instead of third parties such as RSA that are not themselves player software manufacturers.

Even this strategy has problems that concern content publishers. Consider the digital music market, for example. The major music publishers are wary of using a proprietary security solution from one manufacturer of a software music player, because it gives that software manufacturer an unfair advantage in the market and locks the music publishers into that one software manufacturer. Furthermore, the problem of upgrading existing and legacy software players still remains.

In an enterprise setting, the problem of legacy software is especially acute. Enterprises typically upgrade software packages long after the upgrades are released, because of the potential disruption such upgrades can cause to the business. For example, many enterprises were still using Microsoft Office 97 in the year 2001, while in this same year, Microsoft prepared to launch its third major release of the Office Suite since releasing Office 97. Though the newest Office Suite may have built-in features for creating and handling encrypted Office files directly, the older versions of the Office Suite still installed in some parts of the enterprise will not be able to interpret these encrypted files. In situations where a secure document needs to be exchanged across enterprise boundaries, this can be particularly vexing.

With respect to plugin strategies, one problem is that they rely on the application to provide a plugin interface that is appropriate for such a solution. Many important applications do not provide such interfaces. For example, there is often a need for securing CAD files that contain proprietary product details, but existing CAD packages that are widely deployed do not provide a plugin interface that allows the implementation of a security solution.

A further problem with the plugin solution is that it is application-specific. Thus, in the case of the MP3 audio format, where there are numerous players installed in the field, a separate plugin module would have to be developed for each player. When the end user upgrades a player, he is responsible for upgrading the plugin as well, assuming an upgraded plugin is readily available at that time.

U.S. Pat. No. 6,317,868, “Process for transparently enforcing protection domains and access control as well as auditing operations in software components” by Grimm, et al., describes another technique for enforcing content protection transparently without requiring the cooperation of the content processing application vendor. Grimm is aimed at enforcing controls on the applications themselves, rather than the content files they process. Although it might be possible to extend Grimm to protect content files as well, Grimm does not address the difficulties of actual deployment on a commercial scale.

For example, Grimm appears to require that the disk image of the content processing application be modified prior to its execution. Thus, Grimm employs a “static integration” scheme, where protection functionality is integrated with an application prior to execution time.

In contrast, the present invention is a dynamic integration scheme, with the integration being repeated every time execution begins. The static integration scheme suffers from several limitations.

For example, it is generally impossible to determine all of an application's dependencies (i.e., the required DLLs, other data structures it uses at runtime, etc.) from a static analysis of the application binary. For instance, many Win32 applications use “LoadLibrary” to dynamically load certain libraries at execution time, making it very difficult to statically enforce any protection policy on such code.

In addition, many commercial applications invoke operating system DLLs that are not part of the application itself, but nonetheless provide access to many system objects such as the file system. Modifying the disk image of system DLLs can be catastrophic to the robustness of the entire operating system.

Furthermore, the DLLs of some commercial applications have a built-in “checksum” mechanism to detect tampering of their disk image. The tool chain that creates the DLL binary at the application vendor's site embeds a checksum value in the DLL header. This checksum is computed using an algorithm implemented by the operating system. For example, Win32 operating systems provide a “CheckSumMappedFile” system call that can produce the checksum number for a given DLL or EXE file, which may be embedded into the header area of the DLL or EXE file. Thereafter, when the Win32 loader loads that DLL into memory on the end user's system, it will compute the checksum itself and compare the result with the checksum embedded in the header. If these two numbers do not match, the loader returns a failure and the application aborts. Because the checksum depends on the bytes that make up the disk image of the original DLL as shipped by the content processing application vendor, any modification of this disk image could cause a checksum comparison failure during loading.
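
By way of illustration only, the load-time check described above may be sketched in Python as follows. This is a toy model under stated assumptions: the additive checksum, the four-byte header layout, and the helper names `toy_checksum`, `build_image` and `loader_accepts` are hypothetical and do not reproduce the actual Win32 algorithm or executable file format.

```python
import struct

HEADER_SIZE = 4  # hypothetical header: one little-endian 32-bit checksum field


def toy_checksum(body: bytes) -> int:
    """Toy stand-in for the operating system's checksum algorithm."""
    return sum(body) & 0xFFFFFFFF


def build_image(body: bytes) -> bytes:
    """Vendor tool chain: embed the checksum of the body in the header."""
    return struct.pack("<I", toy_checksum(body)) + body


def loader_accepts(image: bytes) -> bool:
    """Loader: recompute the checksum and compare it with the embedded value."""
    (embedded,) = struct.unpack("<I", image[:HEADER_SIZE])
    return toy_checksum(image[HEADER_SIZE:]) == embedded


# Disk image as shipped by the (hypothetical) application vendor.
original = build_image(b"\x55\x8b\xec\x90\xc3")

# Static integration: one byte of the body is patched on disk (0x90 -> 0xcc)
# without updating the embedded checksum.
tampered = original[:-2] + b"\xcc" + original[-1:]
```

In this sketch `loader_accepts(original)` succeeds while `loader_accepts(tampered)` fails, which mirrors the reason a statically modified disk image may be rejected at load time.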

By dynamically performing the integration, the present invention only modifies the already loaded memory image of the DLL, and never its disk image, thus avoiding the above problems.

In addition, static modification of applications can create serious adoption barriers for content publishers interested in distributing protected content files to machines that are outside their jurisdiction. For example, a department within enterprise A wishing to send a protected PDF file to a department within enterprise B cannot require enterprise B to modify all installed copies of their Acrobat PDF reader application in order to view the protected PDF file.

U.S. Pat. No. 5,953,534, entitled “Environment manipulation for executing modified executable and dynamically-loaded library files,” to Romer, et al., describes a technique used to statically transform an application DLL or EXE file, such that the transformed version behaves the same as the original, but allows features like instrumentation, security, auditing, etc., to be implemented transparently. This scheme suffers from the same problems as Grimm regarding static integration. Manipulation of import tables is performed statically. This implies that the import table entries have not yet been initialized by the operating system linker, so that the import table can be replaced in its entirety if so desired.

The present invention, on the other hand, only patches the loaded memory image of the import table after the linker has initialized the import table's entries with target addresses. Furthermore, only the relevant entries that pertain to file I/O related calls need to be patched.

A final problem with static integration of a security policy with the content processor application is that it binds a single policy with the application. Thus, if two different content publishers A and B want to associate two different content protection modules with their respective documents, two different versions of the content processor application will have to be created on the end user's machine. By using dynamic integration, the present invention allows the same content processor application to be used for both.

In summary, the present invention allows any digital content protection solution to be deployed easily, without disrupting an existing installed base of legacy applications, or preventing upgrades or replacement of these applications. It effectively disassociates the content protection enforcement from the content processor application, thereby empowering content publishers to use any content protection method of their choice, without tying it to a specific content processor application. It is also content processor application “agnostic,” allowing a single solution to work across a variety of applications that may all be capable of processing the same content format type. The invention also does not rely on the existence of a plugin interface in the application, allowing it to work even with future upgrades of the current application.

Accordingly, the inventive method for extending a content processor application includes loading the content processor application into memory from a master image to form a runtime content processor application image, and dynamically integrating a protection agent into the loaded runtime content processor application image to form a customized content processor application with extended functionality. Only the runtime content processor application image is extended with the protection agent; the application master image remains unaltered.

The protection agent may comprise an amalgamator and one or more content protection modules. The protection agent is integrated into the runtime content processor application image by first injecting the amalgamator into the runtime content processor application image. The amalgamator then loads the content protection modules, and integrates the modules with the runtime content processor application image to provide the extended functionality. Such extended functionality may include, for example, accessing protected content.

The protection agent preferably executes within the same address space as the customized content processor application, and is thus easily able to support editing of protected content without the loss of protection, for example by intercepting I/O function calls, memory storage calls, cut/paste calls, etc.

Some content protection modules may be used to access protected content.

Content protection modules may be produced by a third party. Plural content protection modules may be simultaneously registered with the protection agent, and may correspond to different content formats. They may be used in parallel by the customized content processor application to process different documents, or to process different portions of a single document. Content protection modules may be used to prevent the export, in an unprotected form, of at least a portion of the protected content, for example, by causing all I/O operations that target unprotected files/memory buffers to write out data in a protected format. Furthermore, content protection modules may be used to maintain an audit trail.

The customized content processor may process both protected and unprotected content. The extended functionality may include, for example, content protection, rights management, and encryption/decryption.

Preferably, the protection agent is independent of the content processor application. That is, they may be independently developed, each with no prior knowledge of the other. Similarly, the protection agent may be independent of any plugin interface provided by the content processor application.

The content processor application may be an existing/legacy application.

The dynamic integration of the protection agent with the content processor application may be performed either in hardware, software, or a combination.

In one embodiment, the operating system boot process is modified so that an attempt to launch an application that serves as an interactive shell (such as Microsoft's Windows Explorer, which can be used to launch other applications) launches an integration agent application instead. The integration agent then launches the intended application (e.g., Microsoft's Windows Explorer) and dynamically integrates it with the protection agent.

In another embodiment, an end user may explicitly enable the automatic and dynamic integration of the protection agent into all subsequently launched content processor applications.

The dynamic integration of the protection agent into the loaded application runtime memory image may be performed by an integration agent, which may be a standalone software application. In one embodiment, the integration agent may be associated with a file type corresponding to at least one protected content document.

Integration of the protection agent into the runtime content processor application image may include identifying file I/O related operating system calls that can be made by the application, and then overwriting the identified file I/O related operating system calls to point to corresponding functions which extend the functionality of the content processor application. For example, file I/O related operating system calls may be identified by examining an import table associated with the runtime content processor application image. These system calls may then be intercepted by overwriting corresponding identified entries in the import table. Alternatively, calls to functions that load additional executable code, such as a dynamically linked library (DLL) module, into memory may be identified and overwritten to point to corresponding functions contained within the protection agent.
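
The interception of import table entries may be pictured with the following minimal Python sketch, offered as an analogy rather than as an actual implementation. Here the in-memory import table is modeled as a dictionary mapping function names to callables; the functions `os_read_file` and `os_get_tick_count`, the set `FILE_IO_CALLS`, and the upper-casing transform (standing in for decryption) are all assumptions made for illustration.

```python
import io


# Hypothetical stand-ins for operating system entry points.
def os_read_file(handle):
    return handle.read()


def os_get_tick_count():
    return 12345


# Assumed set of import names that are treated as file I/O related.
FILE_IO_CALLS = {"ReadFile"}


def make_hook(original, transform):
    """Build a replacement that calls the original, then post-processes."""
    def hooked(handle):
        return transform(original(handle))
    return hooked


def patch_import_table(import_table, transform):
    """Overwrite only the file I/O entries of an already initialized table."""
    patched = []
    for name, target in list(import_table.items()):
        if name in FILE_IO_CALLS:
            import_table[name] = make_hook(target, transform)
            patched.append(name)
    return patched


# Table as it would look after the loader has filled in target addresses.
import_table = {"ReadFile": os_read_file, "GetTickCount": os_get_tick_count}

# The transform is a placeholder for a content protection module's decryption.
patched_names = patch_import_table(import_table, lambda data: data.upper())
result = import_table["ReadFile"](io.StringIO("clear text"))
```

Only the `ReadFile` entry is redirected in this sketch; the unrelated `GetTickCount` entry keeps its original target, reflecting that only entries relevant to file I/O need to be overwritten.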

In one embodiment, the steps of loading the application and integrating the protection agent into the runtime content processor application image are performed automatically and transparently (that is, without the knowledge or active participation of the end user) when the application is selected for execution.

The integration agent may be registered in place of the content processor application, so that the integration agent is executed when the application is selected. The integration agent then proceeds to integrate the protection agent with (i.e., inject the protection agent into) the runtime content processor application image.

A system for extending a content processor application according to an embodiment of the present invention includes a loader, an integration agent and a protection agent. The loader loads the content processor application into memory from a master image to form a runtime content processor application image. The integration agent dynamically integrates a protection agent into the loaded runtime content processor application image to form a customized content processor application with extended functionality, only the runtime content processor application image being extended with the protection agent, leaving the application master image unaltered. The protection agent provides access to protected content.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.

FIG. 1A is a high-level block diagram that illustrates the creation of a protected document containing protected content.

FIG. 1B is a high-level block diagram illustrating the use or access of the protected content with an embodiment of the present invention.

FIG. 2 is a flowchart of the dynamic integration process as implemented by the integration agent of the present invention.

FIG. 3A is a flowchart illustrating details of step 203 of FIG. 2.

FIG. 3B is a schematic diagram illustrating the process of FIG. 3A.

FIG. 4 is a flowchart illustrating the amalgamator initialization routine of step 309 of FIG. 3A.

FIG. 5 is a flowchart illustrating the steps performed by the integration agent when it begins execution under one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

A description of example embodiments of the invention follows.

As used herein, a “content protection module” (or protection module, for short) is a software module which provides clear-text access to cipher-text content for only authorized users. Further, a “content processor application” (or content processor) is any software application that supports the viewing and/or editing of clear-text content files. Also, a “protected content” is an encrypted cipher-text file (possibly containing additional information required for authentication), which the content protection module knows how to decrypt and interpret.

The content protection module may also be implemented using a combination of software and hardware. In any case, the software component, possibly backed by hardware support, must include the “main” or “driver” portion because it is integrated into the software of the content processor application and gets control before the content processor application.

FIG. 1A is a high-level block diagram that illustrates the creation of a protected document. The original (clear-text) document 101 is encrypted at 105 by the content protection module 103, to form a protected document (cipher-text) 107. In one embodiment, the protected document 107 comprises an encrypted header area 109 and the protected content 111.

FIG. 1B is a high-level block diagram illustrating the use or access of the protected document 107 and hence protected content 111 with an embodiment of the present invention. To view or edit the protected document 107, the content processor application 121 must be integrated with an appropriate content protection module 103 to translate or decrypt the document contents into a clear-text version 123 of the original document 101. The existence of the content protection module 103 is transparent to the content processor application 121.

The present invention does not depend upon the specific content protection algorithm used by the content protection module 103 (e.g., for encryption or authentication); any desired method may be integrated with a new or legacy content processor application 121 without requiring any access to source code of the content processor application, or any cooperation from the vendor of that application. The focus is instead on the integration of a protection scheme with the content processor application 121, and not on the particular technology used for content protection (encryption, authentication, etc.).

An embodiment of the present invention consists of an “integration agent”, which may be a stand-alone application, and an “amalgamator” module that automatically integrates one or more content protection modules 103 with any existing or legacy content processor application 121 that is capable of processing the original clear-text form of the content/document.

Integration is performed dynamically when the content processor application 121 (FIG. 1B) is launched, and is performed only on the image of the content processor application loaded into memory. The content processor application 121 itself is never modified on disk, and no source-level integration or plugin interface is required. Applicants thus refer to the memory image of the content processor application modified by such a process as a “dynamically customized content processor”, or more simply, a “customized content processor” 121A. Thus, a content protection module 103 may be automatically and dynamically integrated with any existing content processor application 121 that is used to process a content file.

For example, if the content processor application 121 is Microsoft Word and the original content file 101 is a Word document, then the present invention enables a Word document to be transferred and stored as a protected cipher-text document 107 (which need not be in a format that is compatible with the Microsoft Word application), while an existing Microsoft Word application 121 can process it as if it were a regular clear-text Word file.

When the Microsoft Word application 121 is launched, the dynamic integration process of the present invention automatically integrates an appropriate content protection module 103 with the Microsoft Word application 121 in memory. Thereafter, whenever the executing Word application 121 accesses the encrypted Word document 107, the content protection module 103 is invoked automatically and transparently, dynamically authenticating the access rights and decrypting the portion of the protected document that is being accessed.

This dynamic integration occurs only on the memory image of the content processor application 121, not in its disk image. Thus, an installed content processor application 121 is never modified on disk, and the user's experience of working with the document is unchanged, unless some authentication or authorization check fails.

In one embodiment, the integration agent application and the amalgamator module are small enough that they can be shipped with the protected document 107, so that the end user can install them if they are not already installed on the user's computer.

FIG. 2 is a flowchart illustrating the dynamic integration process as implemented by the integration agent 200, once installed, of an embodiment of the present invention.

The process begins, at step 201, when the user indicates, by double-clicking or through other means, a desire to launch a content processor application 121 either explicitly, for example, by clicking on the application itself, or implicitly, for example, by clicking on the content to which access is desired.

At step 202, the integration agent 200 intercedes and launches the content processor application 121 in a suspended state. Launching the content processor application 121 in this manner ensures that no part of the application's execution occurs outside the control of the integration agent 200 or the content protection module 103.

At step 203, the integration agent 200 begins the dynamic integration of the amalgamator module and content protection modules 103 with the content processor application 121. As part of this step, the required content protection modules 103 are loaded and the memory image of the content processor application 121 is modified so that the content protection modules 103 are tightly integrated into the application, producing the customized content processor 121A. FIG. 3A, discussed below, provides further details of step 203.

After receiving confirmation from the amalgamator that the dynamic integration (or injection) has succeeded, the integration agent 200 terminates (step 204).

At step 205, execution of the customized content processor application 121A, with its modified capabilities, is resumed. The customized content processor application 121A processes the protected content as directed by the user. Step 205 continues to process until some exit condition is indicated, as at step 206.

On the other hand, if at any time an authorization failure is detected, then at step 207, an appropriate security-specific action may be triggered. For example, a pop-up message may appear, and/or the execution of the customized content processor application 121A may terminate.
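
The sequence of steps 202 through 207 may be summarized by the following Python sketch. It is a simulation only: `SimulatedProcess`, `amalgamate` and `integration_agent` are hypothetical names, no real operating system process is created, and the choice to terminate the process when integration fails is an assumption made for illustration rather than a behavior stated above.

```python
class SimulatedProcess:
    """Toy model of a content processor process; not a real OS process."""

    def __init__(self, name):
        self.name = name
        self.state = "suspended"  # step 202: launched in a suspended state
        self.modules = []         # content protection modules integrated so far

    def resume(self):
        self.state = "running"

    def terminate(self):
        self.state = "terminated"


def amalgamate(process, protection_modules):
    """Step 203: integrate the modules into the process image; report success."""
    if not protection_modules:
        return False  # assumption: nothing to integrate counts as a failure
    process.modules.extend(protection_modules)
    return True


def integration_agent(app_name, protection_modules):
    process = SimulatedProcess(app_name)         # step 202
    if amalgamate(process, protection_modules):  # step 203
        process.resume()                         # step 205
    else:
        process.terminate()                      # one possible action at step 207
    return process                               # step 204: the agent's work ends
```

For example, `integration_agent("word", ["module_a"])` yields a process in the "running" state with one integrated module, whereas an empty module list leaves the simulated process "terminated" without ever having run unprotected.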

The system is flexible in that any number of content protection modules 103 can be registered for the same type of document. Thus, for example, different content publishers can, if they prefer, register different content protection modules at different times or even at the same time, for a single document type. Even a single content publisher could similarly register different content protection modules 103 for a single document type.

In addition, multiple content protection modules 103 may be integrated into a single content processor application 121, so that the resulting customized content processor application 121A can handle many different protection schemes. Depending on the protected content file being accessed, the appropriate content protection module 103 may automatically be invoked.

The same content publisher may associate different content protection modules 103 with different protected content files, even if all of the files will be processed by the same customized content processor application 121A.

Furthermore, different protected content files requiring different content protection modules 103 may be processed simultaneously by a customized content processor application 121A that is capable of processing several content files simultaneously. For example, a user may use a content processor application, such as Microsoft Word, to open multiple windows simultaneously, each window containing a protected document 107. In an embodiment of the present invention, the customized content processor application 121A may invoke the appropriate content protection module 103 for each protected Word document 107.
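
One way to picture the selection of an appropriate content protection module for each document is the following Python sketch. The `ProtectionAgent` class, the dictionary-based document representation with a "scheme" tag, and the two toy modules (string reversal and case swapping, standing in for real decryption) are all hypothetical and chosen only to keep the example self-contained.

```python
class ProtectionAgent:
    """Toy registry that dispatches each document to its registered module."""

    def __init__(self):
        self._modules = {}

    def register(self, scheme, module):
        """Register a content protection module under a scheme identifier."""
        self._modules[scheme] = module

    def open_document(self, document):
        scheme = document.get("scheme")
        if scheme is None:
            return document["body"]  # unprotected content passes through
        module = self._modules.get(scheme)
        if module is None:
            raise PermissionError("no module registered for " + scheme)
        return module(document["body"])


agent = ProtectionAgent()
# Two publishers register different modules at the same time.
agent.register("publisher_a", lambda body: body[::-1])
agent.register("publisher_b", lambda body: body.swapcase())

doc_a = {"scheme": "publisher_a", "body": "olleh"}
doc_b = {"scheme": "publisher_b", "body": "WORLD"}
doc_plain = {"body": "unprotected"}

opened = [agent.open_document(d) for d in (doc_a, doc_b, doc_plain)]
```

In this sketch three documents are opened side by side: two are routed to different modules and the third, carrying no scheme tag, is returned unchanged.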

Furthermore, the customized content processor 121A can simultaneously process protected content as well as clear-text (i.e., unprotected) content.

Because content protection modules 103 execute within the address space of the customized content processor application 121A, editing of protected content may be supported. For example, the content protection module 103 may include support for decrypting the protected content 111 to permit the customized content processor 121A to edit it, as well as encryption support for protecting the content upon a save to disk or any other mechanism by which protected data is extracted from the address space of the customized content processor 121A.
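
The read and save paths just described may be illustrated with the following Python sketch. It assumes a trivially weak single-byte XOR transform (`toy_cipher`) purely as a placeholder for whatever protection algorithm a module actually uses, and models the disk as a dictionary; the class `ProtectedStore` and the file names are hypothetical.

```python
KEY = 0x5A  # hypothetical single-byte key; not a real protection scheme


def toy_cipher(data: bytes) -> bytes:
    """Toy symmetric transform: applying it twice returns the input."""
    return bytes(b ^ KEY for b in data)


class ProtectedStore:
    """Models hooked file I/O in which data on 'disk' is always cipher-text."""

    def __init__(self):
        self.disk = {}

    def hooked_read(self, name: str) -> bytes:
        # Decrypt on the way into the application's address space.
        return toy_cipher(self.disk[name])

    def hooked_write(self, name: str, clear: bytes) -> None:
        # Re-protect on the way out of the application's address space.
        self.disk[name] = toy_cipher(clear)


store = ProtectedStore()
store.disk["report.doc"] = toy_cipher(b"secret plan")

text = store.hooked_read("report.doc")       # application sees clear text
store.hooked_write("copy.doc", text + b" v2")  # edited copy saved under a new name
```

The edited copy stored under the new name contains no clear text in this sketch, yet reading it back through the hook returns the edited content.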

For example, an appropriately authorized user can open a protected document 107 (e.g., using Microsoft Word), edit that document, and then cut text out of that document or save that document to another file system under a new filename. The exported data remains protected with the security attributes of the original file (i.e., the security status of the exported portions of the protected content 111 does not change). Furthermore, this entire protection process can occur without the knowledge of the appropriately authorized user.

Thus, the invention enables a content publisher to use a custom content encryption or digital rights management format for shipping documents, and have its corresponding custom security solution or authenticator be automatically integrated with whatever content processing application 121 exists on the end user's machine to produce a customized content processor 121A. Furthermore, because the integration is done dynamically without altering the disk image of the content processing application 121, any upgrading or replacement of the content processor application 121 does not affect the ability of the authorized end user to process the protected document. Finally, because the processing of protected content 111 is integrated into applications only when needed at runtime, any upgrade or change to algorithms used in the content protection modules 103 can be deployed without the need to redeploy the content processor applications 121.

Next described is an embodiment of the invention in the context of the Windows operating system (specifically a Win32 system such as Windows 2000 and Windows XP) and a Win32 executable application (e.g., Microsoft Word). Although this description focuses on specific applications, executable formats and operating systems, those of ordinary skill in the art will understand that the scope of the invention is not intended to be limited in any way by this particular example.

A Win32 application typically consists of an executable (EXE) file and several, separately compiled, dynamically linked library (DLL) files. These files contain the binary code for the functions that compose the application. Some of the DLLs may be part of the Win32 operating system library, and not part of the application itself.

Launching and running an application involves some existing process on the computing system requesting that a new process be created and loaded with the memory image of the EXE file. Under Windows, “Explorer.exe” (or Explorer, for short) is this existing process; it provides the desktop user interface (UI) familiar to users of Windows. In addition, the “CreateProcess” function in Win32 is the interface point that invokes the Win32 loader for launching a Win32 application.

Functions defined external to but referenced within an EXE or a given DLL file are listed in a special area of the file called the “import table”. The import table contains a unique entry for each external function, and is initialized by an operating system utility called the “loader” to contain the actual target address in memory for that function. At runtime, the Win32 loader first loads the EXE file into memory, then examines the EXE file's import table for DLL files that it could reference during the course of execution. Finally, the loader loads each of the DLL files in turn.

Each time a new DLL is loaded, the loader repeats this sequence to load other DLL files that this one may require during the course of execution. The process completes when all DLL dependencies have been resolved, that is, all referenced DLLs have been loaded into memory and initialized. The loader then causes control to jump to the entry point of the application in the EXE memory image, upon which the application begins executing.
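The loading sequence described above can be pictured with a small model. The following Python sketch is only an illustration of the recursive dependency walk, not the Win32 loader itself; the module names and the `IMPORTS` mapping are hypothetical, and a real loader may map modules in a different order.

```python
from typing import Dict, List

# Hypothetical import lists: module name -> modules named in its import table.
IMPORTS: Dict[str, List[str]] = {
    "app.exe": ["ui.dll", "kernel32.dll"],
    "ui.dll": ["gdi32.dll", "kernel32.dll"],
    "gdi32.dll": ["kernel32.dll"],
    "kernel32.dll": [],
}


def load_image(root: str, imports: Dict[str, List[str]]) -> List[str]:
    """Return the modules in the order this toy loader maps them, each once."""
    loaded: List[str] = []

    def load(module: str) -> None:
        if module in loaded:         # already mapped into the image; skip it
            return
        loaded.append(module)        # "load" the module into memory
        for dep in imports[module]:  # then repeat the sequence for its imports
            load(dep)

    load(root)
    return loaded


# The walk ends when every referenced module has been loaded.
order = load_image("app.exe", IMPORTS)
```

In this model the EXE is always first in `order`, and a DLL shared by several modules is loaded only once.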

In other executable formats, a similar table-like structure may provide the information necessary to allow the system loader to initialize the executables (EXEs in Win32) and shared libraries (DLLs in Win32) so that control can flow between these separately-compiled modules.

It is also possible for an application to specify that a DLL should be loaded and initialized by the Win32 loader. The “LoadLibrary” function in Win32 provides such a capability. Some systems also provide for the delayed loading of DLLs, where the actual loading of a DLL does not occur until the code attempts to transfer control to a function contained within that DLL.

In an embodiment of the present invention, the process described above is modified with respect to launching and running of an EXE, in a manner that involves no additional effort by the user wishing to run the EXE or by the vendors that provided the EXE and associated DLL files. The following explains how the dynamic integration process of an embodiment of the present invention modifies the process of launching and running an EXE to achieve the desired goal of automatic and user-transparent content protection using given content processor applications.

The first step is to achieve the launching of the content processor application 121 for processing the protected content file 107 under control of the integration agent 200. There are a number of different ways to accomplish this, two of which are now discussed.

As a first example, the integration agent 200 may be registered as the application associated with the file type corresponding to protected content documents 107. For example, assume that a protected document will always have the file extension “.CTL”. Then by registering the integration agent 200 (at integration agent installation time) as the application associated with the “.CTL” file type, an attempt to open a “.CTL” file will automatically cause the integration agent 200 to be invoked by the operating system. The “.CTL” file selected by the user will then be passed to the integration agent application as an input parameter by the operating system.

FIG. 5 is a flowchart illustrating, for this example, the steps performed by the integration agent 200 of an embodiment of the present invention when it begins execution. Steps 501-503 expand on step 202 of FIG. 2.

First (step 501), the integration agent 200 checks the header area 109 (FIG. 1A) of the input protected document 107 to determine the original file type of the document (e.g., “.DOC” for a Word document). The “header area” is defined as part of the protected content file format, and contains, among other things, information about the original content file (such as its size, file type, etc.), and authentication information that will be read by the content protection module 103. The actual format of the header area is dependent on the implementation of the content protection module 103, which is outside the scope of this invention.

Next, at step 502, the integration agent 200 looks in the Windows registry for the content processor application 121 that is currently registered to handle documents of the original file type. For example, this may be some version of Microsoft Word.

Next, at step 503, the integration agent 200 launches the registered content processor application 121 in suspended mode. This is possible in Win32 via the “CreateProcess” system call, which invokes the Win32 loader to load the .EXE executable file and all of its dependent DLLs into memory. By launching the application 121 in suspended mode, the integration agent 200 regains control after the application 121 is loaded into memory, but before it starts execution.
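Steps 501-503 can be summarized as a short pipeline. The Python sketch below is a model only: the header is represented as a plain dictionary with a hypothetical `original_type` field (the actual header format is left to the content protection module), the `REGISTRY` mapping stands in for the Windows registry, and `SuspendedProcess` stands in for a process created in suspended mode (on Win32, a suspended launch corresponds to passing the `CREATE_SUSPENDED` creation flag to “CreateProcess”).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SuspendedProcess:
    """Stand-in for a process that is loaded but has not started executing."""
    executable: str
    document: str
    running: bool = False


# Stand-in for the registry's file-type associations (hypothetical values).
REGISTRY: Dict[str, str] = {
    ".CTL": "integration_agent.exe",
    ".DOC": "winword.exe",
}


def launch_for(document: str, header: Dict[str, str],
               registry: Dict[str, str]) -> SuspendedProcess:
    original_type = header["original_type"]         # step 501: read header area
    executable = registry[original_type]            # step 502: look up the handler
    return SuspendedProcess(executable, document)   # step 503: launch suspended


# Hypothetical header of a protected document that was originally a .DOC file.
proc = launch_for("report.CTL", {"original_type": ".DOC"}, REGISTRY)
```

The returned object is deliberately not running: in the described method, the agent still has to integrate the protection modules before execution resumes.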

As another example of launching, the dynamic integration method may modify the behavior of, say, the Windows Explorer process. The purpose of the modification is to customize Windows Explorer using the dynamic integration method so that it acts as the integration agent when launching content processor applications 121. Here, the integration process modifies the Explorer so that this process's control flow is directed into a content protection module 103 before execution of the “CreateProcess” call that is used to launch any application in response to a user's interaction with the Windows desktop UI. The code in the protection module 103 may launch the content processor application 121 (Microsoft Word) in suspended mode via its own invocation of the “CreateProcess” system call. As above, the integration agent 200 (this time as a module within Windows Explorer and not as a standalone application) gains control after loading but before the content processor application 121 starts its execution.

An astute reader will realize that the problem of gaining control of the content processor application 121 has simply been changed into a problem of gaining control of the launching of the Windows Explorer process. Again, several options present themselves, two of which are now presented.

One solution is to modify the operating system boot process so that the launching of Windows Explorer is replaced with the launching of the integration agent application whose sole task is to launch and dynamically inject Windows Explorer.

Alternatively, a system may be implemented in which the end user explicitly enables the automatic and dynamic integration of content protection modules 103 into all subsequently launched content processor applications 121. Such a single explicit action may be acceptable and desirable in some end-user situations, and it still provides for the automatic integration of the protection modules 103 with the actual processor applications 121. To achieve such an approach, the Windows Hooks facility supported by the “SetWindowsHookEx” functionality in Win32 may be used to cause Windows to inject the integration agent 200 as a DLL into the address space of the running process that is the Windows Explorer. The second parameter to the “SetWindowsHookEx” function identifies a procedure within the integration agent DLL, and this procedure implements the work done by the integration agent 200 after it has gained control of the content processor application 121, as described below.

Once the integration agent 200 has control of the content processor application 121, the next step is to inject the amalgamator module 72 (FIG. 3B) into the address space of the content processor and direct the content processor application's control flow into this module. Once the amalgamator module has control, the required content protection modules 103 can be loaded, and the memory image of the content processor application 121 can be modified so that the content protection modules 103 are tightly integrated into the application.

A sequence of steps is followed that yields a solution that is broadly applicable across the entire range of programmable computing systems. Broad applicability is achieved by relying on only a small set of capabilities that can be found on almost any programmable computing system. In particular, the approach followed by an embodiment of the present invention requires the capability and permission for one process to read and write its own or another process's address space.

FIG. 3A is a flowchart illustrating details of step 203 of FIG. 2. These details include the steps by which the integration agent 200 carefully works the amalgamator module 72 into the code space of the content processor, even though the content processor 121 was never designed to load the amalgamator 72 or content protection modules 103.

FIG. 3A describes the interaction between the running process (i.e., the integration agent 200) and the suspended process (i.e., the content processor application 121). The integration agent 200 knows the memory address at which execution of the content processor application has been suspended and where to find the amalgamator 72 and content protection modules 103.

FIG. 3B is a schematic diagram of the process of FIG. 3A and is discussed in parallel below.

With reference to the top of FIG. 3B, when a user attempts to launch an application (i.e., a content processor application 121), for example by attempting to open the application itself or a document (content) associated with the application, the integration agent 52 is instead launched. The integration agent 52 then causes the original content processor application 121 to be loaded from a master image such as a disk image 50 to a memory image 54. (This corresponds to steps 201 and 202 of FIG. 2.)

In step 301 (FIG. 3A), the integration agent 52 specializes a small code template 58 (the “integrator generator template”). The particular actions taken during this specialization involve setting of instruction immediates and address offsets that depend upon certain memory image 54 values such as the instruction address in the content processor's address space where the processor application is to resume execution (called the start address), the address of the “LoadLibrary” function, and the location of the amalgamator module 72. The result of this specialization is a sequence of code and data bytes called the “integrator generator” 60, which includes an “integrator” 62.

Next, in step 302 of FIG. 3A, the integration agent creates a “shared byte” store 56 that acts both as a communication structure between the integration agent 52 and the amalgamator module 72, and as a temporary store for information. The integration agent 52 copies a portion 54A of the content processor application 54 code into the store 56, starting at the content processor application's start address and continuing until a number of bytes equal to the size of the integrator generator 60/62 have been copied. This copy operation can be accomplished, for example, using the Win32 “ReadProcessMemory” function. The store 56 may also be written with some meta-data that specifies the size of the copied code and its original starting address, among other things. The shared byte store 56 may be created, for example, as a named, memory-mapped file.

Next, in step 303, the integration agent 52 writes the integrator generator 60/62 into the address space of the content processor application starting at the start address, for example using the Win32 “WriteProcessMemory” function, so that the memory image is now as appears at 64. On some systems (including non-Win32 systems), this and possibly other steps may require temporary manipulation of the virtual memory page protection bits to enable reading and writing of the pages containing the referenced code.
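The save-then-overwrite pattern of steps 302 and 303, together with the later restoration from the shared byte store, can be modeled on a plain byte buffer. The Python sketch below is an illustration under stated assumptions: a `bytearray` stands in for the suspended application's memory image, the `SharedByteStore` class and the helper names are hypothetical, and the stub bytes are placeholders rather than real machine code. A real implementation would read and write another process's memory, for example with “ReadProcessMemory” and “WriteProcessMemory”.

```python
from dataclasses import dataclass


@dataclass
class SharedByteStore:
    """Holds the displaced code plus the meta-data needed to put it back."""
    start: int
    saved: bytes


def save_and_overwrite(image: bytearray, start: int, stub: bytes) -> SharedByteStore:
    end = start + len(stub)
    if start < 0 or end > len(image):
        raise ValueError("stub does not fit inside the image")
    store = SharedByteStore(start, bytes(image[start:end]))  # step 302: save
    image[start:end] = stub                                  # step 303: overwrite
    return store


def restore(image: bytearray, store: SharedByteStore) -> None:
    end = store.start + len(store.saved)
    image[store.start:end] = store.saved   # put the original code back


image = bytearray(b"ORIGINAL-APPLICATION-CODE")
original = bytes(image)
stub = b"\xCC" * 8                         # placeholder "integrator generator"

store = save_and_overwrite(image, 0, stub)
patched = bytes(image)                     # image as it looks while patched
restore(image, store)                      # image is byte-identical again
```

Because exactly as many bytes are saved as are overwritten, the image after `restore` matches the image before patching.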

In step 304, the integration agent 52 “resumes” execution of the content processor application 64 at the start address, for example using the Win32 “ResumeThread” function. Since the first instructions of the integrator generator 60 now reside at that start address, control in the content processor flows to the integrator generator 60. Note that the integrator generator 60 transfers control to the integrator 62 in such a manner that the control flow never has to return to the integrator generator. This is done so that the amalgamator module 72 can restore the application's code originally stored at the start address, thus removing the code for the integrator generator.

Next, in step 305, the integrator generator 60 first saves a portion of the content processor application's state so that resources such as registers can be temporarily used without losing the state of the application. Since the state of the program stack is known, space on the stack can be allocated to save the application's state. Alternatively, other temporary storage (e.g., the shared byte store) could be used.

Next, in step 306, the integrator generator 60 identifies a part of the application's address space free of code or data and creates a code cache 68 in this space. The integrator generator 60 writes a sequence of code and data, called the “integrator” 62, into this code cache. From this cache, calls can safely be made to load and initialize the amalgamator module 72. The image of working memory is now as appears at 66 in FIG. 3B.

Note that the cache 68 is deallocated when the application resumes execution. Such a code cache may be built, for example, by using the next set of free space on the program stack.

Next, in step 307 (FIG. 3A), once the integrator generator 60 has created the integrator 62, it unconditionally jumps to the first instruction in the integrator 62.

The application's code may now be restored as originally found at the start address. In step 308, the integrator 62 loads the amalgamator module 72, so that the memory image is now as appears at 70 in FIG. 3B. Under Win32, this can be done with a call to “LoadLibrary”, with the module name recorded during specialization of the integrator generator template 58.

In step 309 (FIG. 3A), once the amalgamator module 72 has been loaded, control returns to the integrator 62, which then makes a call to an initialization routine in the amalgamator module 72. The details of this initialization routine 309 are explained below, with respect to FIG. 4. Under Win32, a loaded module is given the opportunity to initialize itself by placing some code in the stylized “DllMain” function required in all Win32 DLLs. Applicants have found that it is better to have the integrator 62 make a call to a separate initialization routine run after the completion of the “LoadLibrary” call, since only a small set of the application's and operating system's functionality is available within “DllMain”.

Finally, in step 310, once the amalgamator's 72 initialization routine has completed, control again returns to the integrator 62. Part of the amalgamator's initialization process 309 involves restoring of the content processor application's code originally found at the start address, from the store 56. The integrator 62 can now deallocate the space for the code cache 68 and unconditionally jump to the start address, thus returning control to the now customized content processor.

Again, there are many methods that can accomplish this. Some architectures like the Intel x86 provide a return instruction that simultaneously deallocates a block of space on the program stack. If no architectural mechanism exists for atomically deallocating space and non-trivially changing the program counter, an alternative would be to leave a small amount of code cache space in the content processor application, enough to deallocate the larger code cache and return control to the start address.

The initialization routine 309 (FIG. 4) in the amalgamator module 72 is primarily responsible for loading the content protection modules 76 and tightly integrating them into the memory image of the content processor application to produce the customized content processor, so that the memory image finally appears as at 74 in FIG. 3B. The amalgamator 72 and content protection modules 76 are collectively called the “protection agent”.

This integration process, however, does not stop once the content processor application begins running. Other events, such as the delay loading of a DLL or the explicit loading of a DLL by code in the content processor, may require that some portion of the integration process run again to ensure that the content protection modules are properly integrated with the current state of the content processor. Similarly, the launching of another content processor application by this content processor requires the amalgamator 72 to act as an integration agent and propagate itself as described above to this new content processor.

In general, the integration process as performed by the amalgamator module 72 begins in its initialization routine 309 and proceeds as shown in the flowchart in FIG. 4. Recall that this integration process begins as part of step 309 of FIG. 3A, once the content processor application 121 (that is, its EXE and dependent DLL files) and amalgamator module 72 (but not the content protection modules) have been loaded.

Referring to FIG. 4, at step 401, the amalgamator 72 determines the type and location of the content protection modules to load and loads them. The first part of this step may be accomplished, for example, by having the amalgamator 72 read part of the header area of a protected content file 107 to determine the associated content protection module 103 for the file. Alternatively, the integration agent 52 may have specified the location of the content protection modules to load by encoding that information in the shared byte store 56.

A content protection module can be shipped separately by the content publisher, or some or all of it can optionally be embedded into the protected content file itself. In all cases, the entire contents of each protection module are copied into the address space of the content processor application.

At step 402, the amalgamator 72 identifies each module (EXE or DLL) that is part of the content processor application's loaded memory image. Using this list, the amalgamator performs steps 403 and 404.

At step 403, the amalgamator 72 examines the module to identify any file I/O-related operating system calls that can be made by the content processor application while executing this module. I/O functions, or system calls, are those involved in input and output to the file system, including operations such as read and write, as well as cut and paste operations. Such an analysis may be performed, for example, by the amalgamator 72 examining the import table of the module looking for addresses of known file I/O-related functions.

At step 404, the identified calls are rewritten by the amalgamator 72 so that control flows not to the I/O related function on such calls but to a corresponding function defined in a content protection module. Again, there are many ways to accomplish this redirecting of control flow.

One such method that is appropriate for the example given above involves the replacement of the I/O-related function call entries in the module's import table with the addresses of corresponding functions defined in the content protection module. The original address inserted by the Win32 loader into that import table entry is also noted in a separate table that is accessible by the content protection module code. This “patching” of the import table ensures that when the application makes the I/O call, the content protection module gets control first, allowing it to perform any authentication or decryption actions before redirecting the call to the original function address.
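The import-table patching just described can be illustrated with a dictionary of callables. The Python sketch below is a model, not Win32 code: the `import_table` dictionary stands in for a module's import table, the `"ReadFile"` key is only a label, the `AUTHORIZED` set is a hypothetical stand-in for an authentication check, and the XOR `toy_cipher` is a placeholder chosen for brevity, not a real content-encryption scheme.

```python
from typing import Callable, Dict

KEY = 0x5A


def toy_cipher(data: bytes) -> bytes:
    """Placeholder for real encryption; XOR is its own inverse."""
    return bytes(b ^ KEY for b in data)


# Hypothetical protected file as it sits on disk (encrypted).
DISK: Dict[str, bytes] = {"report.ctl": toy_cipher(b"secret text")}
AUTHORIZED = {"report.ctl"}   # stand-in for an authentication decision


def os_read_file(name: str) -> bytes:
    """Stand-in for the operating system's file-read function."""
    return DISK[name]


# The "import table" as the loader initialized it.
import_table: Dict[str, Callable[[str], bytes]] = {"ReadFile": os_read_file}
# Separate table of original addresses, kept for the protection module.
original_addresses: Dict[str, Callable[[str], bytes]] = {}


def protected_read_file(name: str) -> bytes:
    if name not in AUTHORIZED:                  # protection module runs first
        raise PermissionError(name)
    raw = original_addresses["ReadFile"](name)  # forward to the original entry
    return toy_cipher(raw)                      # hand clear text to the caller


def patch(table: Dict[str, Callable[[str], bytes]], entry: str,
          replacement: Callable[[str], bytes]) -> None:
    original_addresses[entry] = table[entry]    # note the original address
    table[entry] = replacement                  # redirect the import entry


patch(import_table, "ReadFile", protected_read_file)
plaintext = import_table["ReadFile"]("report.ctl")
```

The application-side call site is unchanged: it still calls through `import_table["ReadFile"]`, which is why the redirection is invisible to the application in this model.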

Other methods exist for achieving this redirection of control that could be used in alternative embodiments of the present invention. An early article by Peter Kessler (Peter B. Kessler, “Fast breakpoints: Design and implementation,” Proceedings of the ACM SIGPLAN'90 Conference on Programming Language Design and Implementation (PLDI), pages 78-84, White Plains, N.Y., 20-22 Jun. 1990. SIGPLAN Notices 25(6), June 1990) describes the basic mechanisms and issues involved in patching code for the purpose of control flow redirection. A more recent article by Galen Hunt and Doug Brubacher (Galen Hunt and Doug Brubacher, “Detours: Binary Interception of Win32 Functions”, Proceedings of the 1999 Usenix Windows NT Symposium, USENIX Association, 1999) describes such a system explicitly for the binary interception of Win32 functions.

Any function calls used to explicitly load another DLL or executable into memory (such as the “LoadLibrary” call in the Windows operating system) are also handled by steps 403 and 404 as described above. For example, a call to “LoadLibrary” by the customized content processor application during its execution will cause the amalgamator 72 to execute steps 403 and 404 for each newly loaded DLL. This ensures that any DLL or executable that is explicitly loaded by the content processor application at execution time will also have its import table entries patched appropriately.

It is important to note that a facility is provided by the amalgamator 72 for the routines in the amalgamator and content protection modules to access the functions protected in steps 403 and 404 above. For example, the “patching” mentioned above occurs only on import tables of modules belonging to the original application and not to modules associated with the amalgamator or the content protection modules.
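The two points above, re-patching modules that are loaded explicitly at execution time and leaving the protection agent's own modules unpatched, can be combined in one small model. In the Python sketch below, import tables are dictionaries mapping function names to string tags that stand in for addresses; all module names, the `ON_DISK` contents, and the helper names are hypothetical.

```python
from typing import Dict

ImportTable = Dict[str, str]

# Address tags of I/O functions and their protected replacements.
REPLACEMENTS = {
    "os.ReadFile": "protect.ReadFile",
    "os.WriteFile": "protect.WriteFile",
}
# Modules that belong to the protection agent and must stay unpatched.
PROTECTION_MODULES = {"amalgamator.dll", "protect.dll"}

# Master images: import tables as they would be initialized on load.
ON_DISK: Dict[str, ImportTable] = {
    "plugin.dll": {"WriteFile": "os.WriteFile", "Beep": "os.Beep"},
    "protect.dll": {"WriteFile": "os.WriteFile"},
}
loaded: Dict[str, ImportTable] = {}


def real_load_library(name: str) -> ImportTable:
    loaded[name] = dict(ON_DISK[name])   # runtime copy; master stays untouched
    return loaded[name]


def patch_module(name: str, table: ImportTable) -> int:
    """Redirect known I/O entries; return how many entries were patched."""
    if name in PROTECTION_MODULES:       # agent modules keep direct access
        return 0
    patched = 0
    for function, address in list(table.items()):
        if address in REPLACEMENTS:
            table[function] = REPLACEMENTS[address]
            patched += 1
    return patched


def hooked_load_library(name: str) -> ImportTable:
    table = real_load_library(name)      # let the normal load happen first
    patch_module(name, table)            # then patch the new module's imports
    return table


plugin = hooked_load_library("plugin.dll")
own = hooked_load_library("protect.dll")
```

Only the runtime copies in `loaded` are changed; the `ON_DISK` entries remain as they were, which mirrors the requirement that the master image stay unaltered.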

Finally, in step 405, the initialization routine 309 of the amalgamator accesses the shared byte store 56, replaces the code and data for the integrator generator with the content processor's original application code, and then in step 406 signals to the integration agent 52 that it is done with the shared byte store 56. As described in step 309 of FIG. 3A, the initialization routine 309 of the amalgamator returns so that the suspended application can resume execution. When the integration agent 52 receives notification from the amalgamator 72 that injection is complete, it terminates (or returns control to the Windows Explorer for the example scenario).

What is left is a single content processor application process 74 (FIG. 3B), which to the end user appears no different from the original content processor application had it been launched normally. In actual fact, the executing content processor application has one or more tightly integrated content protection modules embedded in it.

To the end user, this entire process is transparent. It appears as though double-clicking on (or otherwise selecting) a protected content “.DOC” file directly invoked the content processor application. Assuming no authentication check fails in the protection module, the protected content looks no different to the content processor application than the equivalent clear-text “.DOC” Word file: all viewing and editing operations work normally.

A content protection module 103 can prevent the export of any part of the protected document by simply ensuring that any I/O operations that target an unprotected file or memory buffer only write out encrypted data. That is, “cut/paste” operations can be prevented from being used to copy the contents of the protected document to an unprotected document. Thus, the protection associated with the document (content) continues to persist, independent of the content processor application used to process it.

The invention provides a very practical way to deploy digital content protection solutions in the market. Consider a secure enterprise email solution as an example to illustrate this point. Many current secure email solutions allow an email sender to encrypt an outgoing email, and require the recipient to connect to a trusted server to download an authorization key that will allow the recipient user to decrypt the message. The message itself is never decrypted to disk, so the clear-text form of the original message never persists on the recipient's machine.

However, the problem with existing solutions arises in the case of attachments that are sent with the encrypted message. Attachments can also be encrypted using the same key, but unless the application used to read the attachment on the recipient's machine understands the content encryption format, the attachment cannot be read directly.

Thus, if the attachment is a Word document that was encrypted on send, the attachment has to be first decrypted back to the original Word document on the recipient's machine before the Microsoft Word application on the recipient's machine can read it. Unfortunately this creates a security hole, since the original Word document now persists in clear-text form on the recipient's machine, allowing the recipient to copy it illegally, or distribute it to unauthorized persons in an unprotected (unencrypted) form.

With an embodiment of the present invention, when the recipient opens the encrypted attachment, the integration agent is invoked transparently (to the recipient), launching the Microsoft Word application and dynamically integrating the appropriate content protection module with the application in memory. Finally, the modified application may read the protected Word attachment directly.

Because the content protection module intercepts all I/O traffic between the Microsoft Word application and the content file, the fact that the attachment is an encrypted Word document is also transparent to the Word application. The end user never experiences the integration agent or the integration process, and is unaware that the attachment is actually an encrypted document, unless he or she attempts to make an unauthorized access.

When the content protection module discovers an unauthorized access, it can display a message indicating authorization failure, and terminate the Microsoft Word application. The content protection module could also be used to automatically maintain an audit trail.

Those of ordinary skill in the art should recognize that methods involved in protecting digital content from unauthorized use by automatically and dynamically integrating a content-protection agent may be embodied in a computer program product that includes a computer usable medium. For example, such a computer usable medium can include a readable memory device, such as a solid state memory device, a hard drive device, a CD-ROM, a DVD-ROM, or a computer diskette, having stored computer-readable program code segments. The computer readable medium can also include a communications or transmission medium, such as a bus or a communications link, either optical, wired, or wireless, carrying program code segments as digital or analog data signals.

While the system has been particularly shown and described with references to particular embodiments, it will be understood by those of ordinary skill in the art that various changes in form and details may be made without departing from the scope of the invention encompassed by the appended claims. For example, the methods of the invention can be applied to various environments, and are not limited to the described environment.

While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

1. A method for extending a content processor application, comprising: loading a content processor application into memory from a master image to form a runtime content processor application image; suspending execution of the runtime content processor application image; dynamically integrating a protection agent into the runtime content processor application image to form a customized content processor application with extended functionality by (i) identifying file input/output related operating system calls of the runtime content processor application image that can be made by the application, and (ii) overwriting the identified file input/output related operating system calls of the runtime content processor application image to point to corresponding functions which extend functionality, only the runtime content processor application image being altered and extended with the protection agent, the master image being unaltered; and resuming execution of the customized runtime content processor application image.